build: Ensure binaries built on Fedora 33 run on Fedoras 32 & 31 #534
Here's a scratch build for Fedora 34: https://koji.fedoraproject.org/koji/taskinfo?taskID=49760645 I extracted the |
I merged this because it seems that Rawhide is pretty turbulent right now (there's also containers/podman#7389), so it's proving difficult to get people to test. Let's see how it holds up in the wild.
I just tried on Fedora 34 and I can confirm that it works.
The error before entering the container ( The Thanks for the fix 😄 |
Thanks for the feedback, @juanje!
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got interpreted by whichever /bin/sh was
available.

However, the Go implementation can run into ABI compatibility issues
because binaries built on newer toolchains aren't meant to be run
against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*  0000000000000000  GLIBC_2.34  __libc_start_main
  0000000000000000      DF *UND*  0000000000000000  GLIBC_2.34  pthread_detach
  0000000000000000      DF *UND*  0000000000000000  GLIBC_2.34  pthread_create
  0000000000000000      DF *UND*  0000000000000000  GLIBC_2.34  pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.
Luckily, almost all supported containers have the host's /usr
available at /run/host/usr. This is exploited by embedding RPATHs or
RUNPATHs to /run/host/usr/lib and /run/host/usr/lib64 in the binary,
and changing the path of the dynamic linker (i.e., PT_INTERP) to the
one inside /run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

Based on ideas from Alexander Larsson and Ray Strode.

[1] Commit 6ad9c63
    containers#534
[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
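One way to check whether a given binary carries the PT_INTERP and RPATH/RUNPATH settings described above is to read them back with Go's debug/elf package. The following is only a sketch for inspection; the loaderInfo type and the inspect function are names made up for this example and are not part of toolbox.

```go
package main

import (
	"debug/elf"
	"fmt"
	"io"
	"os"
	"strings"
)

// loaderInfo (hypothetical name) collects the ELF fields that decide which
// dynamic linker and library directories a binary uses at start-up.
type loaderInfo struct {
	Interp  string   // contents of PT_INTERP; empty for static binaries
	Runpath []string // DT_RUNPATH entries, falling back to DT_RPATH
}

// inspect (hypothetical name) reads the loader-related fields of the ELF
// file at path.
func inspect(path string) (loaderInfo, error) {
	var info loaderInfo

	f, err := elf.Open(path)
	if err != nil {
		return info, err
	}
	defer f.Close()

	// The PT_INTERP segment holds a NUL-terminated path to the dynamic linker.
	for _, prog := range f.Progs {
		if prog.Type != elf.PT_INTERP {
			continue
		}
		data, err := io.ReadAll(prog.Open())
		if err != nil {
			return info, err
		}
		info.Interp = strings.TrimRight(string(data), "\x00")
	}

	// Prefer DT_RUNPATH and fall back to the older DT_RPATH tag.
	paths, err := f.DynString(elf.DT_RUNPATH)
	if err != nil {
		return info, err
	}
	if len(paths) == 0 {
		paths, err = f.DynString(elf.DT_RPATH)
		if err != nil {
			return info, err
		}
	}
	for _, p := range paths {
		info.Runpath = append(info.Runpath, strings.Split(p, ":")...)
	}

	return info, nil
}

func main() {
	path := "/usr/bin/toolbox"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	info, err := inspect(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("interpreter:", info.Interp)
	fmt.Println("runpath:", info.Runpath)
}
```

For a binary built as described, the interpreter would be expected to live under /run/host and the run path to list /run/host/usr/lib and /run/host/usr/lib64.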
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.
In the past, this worked perfectly well with the POSIX shell
implementation because it got interpreted by whichever /bin/sh was
available.
The Go implementation also mostly worked so far because it's largely
statically linked, with the notable exception of the standard C
library. However, recently glibc-2.32, which is used by Fedora 33
onwards, added a new version of the pthread_sigmask symbol [1] as part
of the libpthread removal project:
  $ objdump -T /usr/bin/toolbox | grep GLIBC_2.32
  0000000000000000      DO *UND*  0000000000000000  GLIBC_2.32  pthread_sigmask
This means that /usr/bin/toolbox binaries built against glibc-2.32 on
newer Fedoras pick up the latest version of the symbol and fail to run
against older glibcs in older Fedoras.
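The same check can be automated when many symbols are involved. Below is a rough sketch, not part of this change, that scans `objdump -T` output for undefined symbols whose glibc version tag is newer than a chosen baseline. The function names and the 2.31 baseline are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns a tag like "GLIBC_2.32" into []int{2, 32}. It reports
// false for anything that is not a GLIBC version tag.
func parseVersion(tag string) ([]int, bool) {
	if !strings.HasPrefix(tag, "GLIBC_") {
		return nil, false
	}
	var version []int
	for _, part := range strings.Split(strings.TrimPrefix(tag, "GLIBC_"), ".") {
		n, err := strconv.Atoi(part)
		if err != nil {
			return nil, false
		}
		version = append(version, n)
	}
	return version, true
}

// newer reports whether version a is strictly newer than version b.
func newer(a, b []int) bool {
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return len(a) > len(b)
}

// tooNew (hypothetical name) returns the undefined symbols in objdump -T
// output that need a glibc newer than baseline, e.g. "2.31".
func tooNew(objdumpOutput, baseline string) []string {
	base, ok := parseVersion("GLIBC_" + baseline)
	if !ok {
		return nil
	}

	var symbols []string
	for _, line := range strings.Split(objdumpOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 {
			continue
		}

		// Only undefined symbols are resolved against the runtime's glibc.
		undefined := false
		for _, field := range fields {
			if field == "*UND*" {
				undefined = true
			}
		}
		if !undefined {
			continue
		}

		// The version tag precedes the symbol name, which comes last. Some
		// binutils versions wrap the tag in parentheses.
		tag := strings.Trim(fields[len(fields)-2], "()")
		if version, ok := parseVersion(tag); ok && newer(version, base) {
			symbols = append(symbols, fields[len(fields)-1]+"@"+tag)
		}
	}
	return symbols
}

func main() {
	output := "0000000000000000      DO *UND*  0000000000000000  GLIBC_2.32  pthread_sigmask"
	for _, symbol := range tooNew(output, "2.31") {
		fmt.Println(symbol)
	}
}
```

An empty result for a given baseline suggests, but doesn't guarantee, that the binary will start against that glibc.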
One way to fix this is to disable the use of any C code from Go by
using the CGO_ENABLED environment variable [2]. However, this can
negatively impact packages like "os/user" [3] and "net" [4], where the
more featureful glibc APIs will be replaced by more limited
equivalents written only in Go.
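To get a feel for the trade-off, a small program like the one below can be built twice, once normally and once with CGO_ENABLED=0, and run against the same user name. This is only an illustrative sketch; lookupHome is a made-up helper. Without cgo, os/user reads /etc/passwd directly, so accounts that are only known to glibc through other NSS sources may not be found.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
)

// lookupHome (hypothetical helper) returns the home directory of the named
// user. Which implementation answers the query depends on how the program
// was built: glibc's lookup functions with cgo, or a pure Go parser of
// /etc/passwd without it.
func lookupHome(name string) (string, error) {
	u, err := user.Lookup(name)
	if err != nil {
		return "", err
	}
	return u.HomeDir, nil
}

func main() {
	name := "root"
	if len(os.Args) > 1 {
		name = os.Args[1]
	}

	home, err := lookupHome(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	fmt.Println(home)
}
```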
Instead, since glibc uses symbol versioning, it's better to tell the
Go toolchain to avoid linking against any symbols from glibc-2.32.
This was accomplished by a few linker tricks:
  * The GNU ld linker's --wrap flag was used when building the Go code
    to divert pthread_sigmask invocations from Go to another function
    called __wrap_pthread_sigmask.
  * A static library was added to provide this __wrap_pthread_sigmask
    function, which forwards calls to the actual pthread_sigmask API in
    glibc. This library itself was not linked with --wrap, and
    specifies the latest permissible version of the pthread_sigmask
    symbol from glibc for each architecture. Currently, the list of
    architectures covers the ones that Fedora builds for.
  * The Go cmd/link linker was switched to external mode [5]. This
    ensures that the final object file containing all the Go code gets
    linked to the standard C library and the wrapper static library by
    the GNU ld linker for the --wrap flag to kick in.
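As a rough illustration of how these pieces fit together on the `go build` command line, the sketch below assembles the arguments in Go. The -linkmode and -extldflags options belong to cmd/link, and -Wl,--wrap is passed through the C compiler driver to GNU ld. The buildArgs helper, the libc-wrappers.a archive name and the output name are placeholders invented for this example, not the names used by the actual build scripts.

```go
package main

import (
	"fmt"
)

// buildArgs (hypothetical helper) returns the arguments for a `go build`
// invocation that:
//   - switches cmd/link to external mode, so that the system linker does
//     the final link, and
//   - asks that linker to redirect references to symbol to __wrap_<symbol>,
//     which is expected to be provided by wrapperArchive.
func buildArgs(symbol, wrapperArchive, output string) []string {
	// Flags for the external linker, passed through the C compiler driver.
	extldflags := fmt.Sprintf("-Wl,--wrap,%s %s", symbol, wrapperArchive)

	// Flags for cmd/link itself. The single quotes keep the external linker
	// flags together as one value when the go tool splits -ldflags.
	ldflags := fmt.Sprintf("-linkmode external -extldflags '%s'", extldflags)

	return []string{"build", "-ldflags", ldflags, "-o", output}
}

func main() {
	args := buildArgs("pthread_sigmask", "libc-wrappers.a", "toolbox")
	fmt.Printf("go %q\n", args)
}
```

The resulting slice can be handed to exec.Command("go", args...) as is, since the go tool splits the -ldflags value itself.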
Based on ideas from Ondřej Míchal.
[1] glibc commit c6663fee4340291c
https://sourceware.org/git/?p=glibc.git;a=commit;h=c6663fee4340291c
[2] https://golang.org/cmd/cgo/
[3] https://golang.org/pkg/os/user/
[4] https://golang.org/pkg/net/
[5] https://golang.org/src/cmd/cgo/doc.go
#529